loads packages:

library(tidyverse)
library(gapminder)
library(here)
library(plotly)

Exercise 1: Explain the value of the here::here package

here::here gives you the ability to code a path name that is compatible across different operating systems. Different operating systems have different syntax for paths. The here::here package enables others to run your code without requiring them to rewrite the paths for their specific platform syntax. It does so by finding the root directory then the specific path desired in a format that is independent of the platform being used. The code will still run if you move the file since you do not need to change the relative directory. The code will still run if you open files outside of your Rstudio project. The here::here package is beneficial as it simplifies the use of sub-directories for projects.

Exercise 2: Factor management

BEFORE REORDER: Lets take the gapminder dataset and look continents to explore. Using the class function we can check that continents is indeed a factor, then check how many levels and what the levels are. The first plot shows the order of the levels. The arrange function will order continent by the factor levels.

class(gapminder$continent)
## [1] "factor"
nlevels(gapminder$continent)
## [1] 5
levels(gapminder$continent)
## [1] "Africa"   "Americas" "Asia"     "Europe"   "Oceania"
gapminder %>% 
  ggplot(aes(continent, pop, fill= continent)) +
  geom_col()

gapminder %>%
  arrange(continent) %>%
  DT::datatable()

REORDER: We will first drop Oceania from the dataset using filter in a new dataframe called gapminder1. …reorder the levels according to ascending population using fct_reorder function to arrage the levels in a variable called continent_relevel. In the new dataframe gapminder1 the continent_relevels replace the old levels. We can now check the class, number of levels, and levels of gapminder1, and plot the dataframe containing the new levels. The arrange function will order continent by the new factor levels.

gapminder1 <- gapminder %>%
  filter(continent == "Asia"| continent == "Americas" | continent ==  "Africa"| continent ==  "Europe") 

levels(gapminder1$continent)  #check levels after Oceania removal
## [1] "Africa"   "Americas" "Asia"     "Europe"   "Oceania"
gapminder1 <-gapminder1 %>%
  droplevels()
levels(gapminder1$continent) #check levels after Oceania removal and dropping unused levels
## [1] "Africa"   "Americas" "Asia"     "Europe"
continent_relevel <-fct_reorder(gapminder1$continent, gapminder1$pop, max) %>%
  levels()
gapminder1$continent <- factor(gapminder1$continent, levels = continent_relevel)
  
 #check class, number of levels, levels after Oceania removal and reorder
class(gapminder1$continent)  
## [1] "factor"
nlevels(gapminder1$continent)
## [1] 4
levels(gapminder1$continent)
## [1] "Europe"   "Africa"   "Americas" "Asia"
gapminder1 %>% 
  ggplot(aes(continent, pop, fill= continent)) +
  geom_col()

gapminder1 %>%
  arrange(continent) %>%
  DT::datatable()

Exercise 3: File input/output (I/O)

Lets take gapminder and select only 3 columns of country, continent, and population in a new dataframe called gapminder2. Lets then export it, then import it in another dataframe called gapminder3. We can add factor levels for continent and country in order of ascending population.

gapminder2 <- gapminder %>%
  select(country, continent, pop)
gapminder2 %>%
  DT::datatable()
write_csv(gapminder2, here::here("gapminder2.csv"))

gapminder3 <-read_csv(here::here("gapminder2.csv"))
gapminder3 %>%
  DT::datatable()
continent_relevel3 <-fct_reorder(gapminder3$continent, gapminder3$pop, max) %>%
  levels()
gapminder3$continent <- factor(gapminder3$continent, levels = continent_relevel3)
country_relevel3.1 <-fct_reorder(gapminder3$country, gapminder3$pop, max) %>%
  levels()
gapminder3$country <- factor(gapminder3$country, levels = country_relevel3.1)

# check levels after continent and country reordering
levels(gapminder3$continent)
## [1] "Oceania"  "Europe"   "Africa"   "Americas" "Asia"
levels(gapminder3$country) %>%
  head()
## [1] "Sao Tome and Principe" "Iceland"               "Djibouti"             
## [4] "Equatorial Guinea"     "Bahrain"               "Comoros"

The file was successfully exported and imported.

Exercise 4: Visualization design

Here is my sad assignment 2 plot:

ggplot(gapminder %>%
  filter( country == 'Canada'), aes(gdpPercap, lifeExp)) +
  geom_point(colour = 'blue') +
  scale_x_log10('log(gdpPercap)') 

Lets revamp it by adding a linear regression line, making the background white, adding new axis labels, and making it interactive with the plotly function… Looking fresh.

newPlot <-ggplot(gapminder %>%
  filter( country == 'Canada'), aes(gdpPercap, lifeExp)) +
  geom_smooth(method = "lm", colour = 'yellow') +
  geom_point(colour = 'blue') +
  scale_x_log10('log(gdpPercap)') +
  ylab("Life Expectancy") +
  xlab("log(GDP Per Capita)") +
  theme_bw()

newPlot %>%
  ggplotly()

Exercise 5: Writing figures to file

Lets save the last plot as a png.

ggsave("newPlot.png", width = 5, height = 3, units = "in")

Insert plot image:

pic

pic